Scalable, Flexible and Active Learning on Distributions
Abstract
A wide range of machine learning problems, including astronomical inference about galaxy clusters, natural image scene classification, parametric statistical inference, and detection of potentially harmful sources of radiation, can be well-modeled as learning a function on (samples from) distributions. This thesis explores problems in learning such functions via kernel methods, and applies the framework to yield state-of-the-art results in several novel settings. One major challenge with this approach is computational efficiency when learning from large numbers of distributions: typical methods scale between quadratically and cubically in the number of distributions, and so are not amenable to large datasets. As a solution, we investigate approximate embeddings into Euclidean spaces such that inner products in the embedding space approximate kernel values between the source distributions. We provide a greater understanding of random Fourier features, the standard existing tool for constructing such embeddings on Euclidean inputs. We also present a new embedding for a class of information-theoretic distribution distances, and evaluate it and existing embeddings on several real-world applications. The next challenge is that the choice of distance is important for good practical performance, but how to choose a good distance for a given problem is not obvious. We study this problem in the setting of two-sample testing, where we attempt to distinguish two distributions via the maximum mean discrepancy (MMD), and provide a new technique for kernel choice in these settings, including the use of kernels defined by deep learning-type models. In a related problem setting, common to physical observations, autonomous sensing, and electoral polling, we face the following challenge: when observing samples is expensive, but we can choose where to do so, how do we pick where to observe?
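As a concrete illustration of the embedding idea (a minimal sketch, not code from the thesis): random Fourier features for the Gaussian kernel give finite-dimensional vectors whose inner products approximate kernel values, and averaging the features over a sample set yields an approximate mean embedding whose squared distance estimates the MMD between the underlying distributions. All names and parameter values below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, D, sigma = 3, 5000, 1.0  # input dim, number of features, kernel bandwidth

# Sample the spectral measure of the Gaussian RBF kernel once;
# the same (W, b) must be reused for every input (Rahimi & Recht 2007).
W = rng.normal(scale=1.0 / sigma, size=(d, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

def z(X):
    """Map each row of X to its D-dimensional random Fourier feature vector."""
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def rbf(x, y):
    """Exact Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma**2))

# Inner products of features approximate kernel values: z(x).z(y) ~ k(x, y).
x, y = rng.normal(size=d), rng.normal(size=d)
approx = (z(x[None, :]) @ z(y[None, :]).T).item()
exact = rbf(x, y)

# Mean embeddings of two sample sets; their squared distance is a
# (biased) estimate of the squared MMD between the two distributions.
A = rng.normal(size=(500, d))             # samples from N(0, I)
B = rng.normal(loc=1.0, size=(500, d))    # samples from a shifted Gaussian
mmd2 = np.sum((z(A).mean(axis=0) - z(B).mean(axis=0)) ** 2)
```

The approximation error of the inner product decays as O(1/sqrt(D)), which is what makes learning on large collections of sample sets tractable: each distribution is summarized once by a fixed-length feature vector, and any linear method can then be applied.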
We give a method for a closely related problem where we search for instances of patterns by making point observations. Throughout, we combine theoretical results with extensive empirical evaluations to increase our understanding of the methods.

Acknowledgements

To start with, I want to thank my advisor, Jeff Schneider, for so ably guiding me through this long process. When I began my PhD, I didn’t have a particular research agenda or project in mind. After talking to Jeff a few times, we settled on my joining an existing project: applying this crazy idea of distribution kernels to computer vision problems. Obviously, the hunch that this project would turn into something I’d want to keep working on paid off. Throughout that project and the ones since, Jeff has been instrumental in asking the right questions to help me realize when I’ve gone off in a bad direction, in suggesting better alternatives, and in thinking pragmatically about problems, using the best tools for the job and finding the right balance between empirical and theoretical results. Barnabás Póczos also would basically have been an advisor had he not still been a postdoc when I started my degree. His style is a great complement to Jeff’s, caring about many of the same things but coming at them from a somewhat different direction. Nina Balcan knew the right connections to the theoretical community, which I haven’t yet explored as fully as I’d like. I began working in more depth with Arthur Gretton over the past six months or so, and it’s been both very productive and great fun, which is great news since I’m now very excited to move on to a postdoc with him. Several of my labmates have also been key to everything I’ve done here. Liang Xiong helped me get up and running when I started my degree, spending hours in my office showing me how things worked and how to make sense of the results we got. Yifei Ma’s enthusiasm about widely varying research ideas was inspiring.
Junier Oliva always knew the right way to think about something when I got stuck. Tzu-Kuo Huang spent an entire summer thinking about distribution learning with me and eating many, many plates of chicken over rice. Roman Garnett is a master of Gaussian processes and appreciated my disappointment in Pittsburgh pizza. I never formally collaborated much with Ben, María, Matt, Samy, Sibi, or Xuezhi, but they were always fun to talk about ideas with. The rest of the Auton Lab, especially Artur Dubrawski, made brainstorming meetings something to look forward to whenever they happened. Outside the lab, Michelle Ntampaka was a joy to collaborate with on applications to cosmology problems, even when she was too embarrassed to show me her code for the experiments. The rest of the regular Astro/Stat/ML group also helped fulfill, or at least let me feel like I was fulfilling, my high school dreams of learning about the universe. Fish Tung made crazy things work. The xdata crew made the perhaps-too-frequent drives to DC and long summer days packed in a small back room worth it, especially frequent Phronesis collaborators Ben Johnson, who always had an interesting problem to think about, and Casey King, who always knew an interesting person to talk to. Karen Widmaier, Deb Cavlovich, and Catherine Copetas made everything run smoothly: without them, not only would nothing ever actually happen, but the things that did happen would be far less pleasant. Jarod Wang and Predrag Punosevac kept the lab machines going, despite my best efforts to crash, overload, or otherwise destroy them. Other, non-research friends also made this whole endeavor worthwhile.
Alejandro Carbonara always had jelly beans, Ameya Velingker made multiple spontaneous trips across state lines, Aram Ebtekar single-handedly and possibly permanently destroyed my sleep schedule, Dave Kurokawa was a good friend (there’s no time to explain why), Shayak Sen received beratement with grace, and Sid Jain gave me lots of practice for maybe having a teenager of my own one day. Listing friends in general is a futile effort, but here are a few more of the Pittsburgh people without whom my life would have been worse: Alex, Ashique, Brendan, Charlie, Danny, the Davids, Goran, Jay-Yoon, Jenny, Jesse, John, Jon, Junxing, Karim, Kelvin, Laxman, Nic, Nico, Preeti, Richard, Ryan, Sarah, Shriphani, Vagelis, and Zack, as well as the regular groups for various board/tabletop games, everyone else who actually participated in departmental social events, and machine learning conference buddies. Of course, friendship is less constrained by geography than it once was: Alex Burka, James Smith, and Tom Eisenberg were subjected to frequent all-day conversations, the Board of Shadowy Figures and the PTV Mafia group provided much pleasant distraction, and I spent more time on video chats with Jamie McClintock, Luke Collin, Matt McLaughlin, and Tom McClintock than was probably reasonable. My parents got me here and helped keep me here, and even if my dad thinks I didn’t want him at my defense, I fully acknowledge that all of it is only because of them. My brother Ian and an array of relatives (grandparents, aunts, uncles, cousins, and the more exotic ones like first cousins once removed and great-uncles) are also a regular source of joy, whether I see them several times a year or a handful of times a decade. Thank you.